Correlation-Based Refinement of Rules with Numerical Attributes

Authors

  • André Melo
  • Martin Theobald
  • Johanna Völker
Abstract

Rule learning is a common way of extracting useful information from knowledge bases and databases. Many such data sets contain numerical attributes. However, approaches such as Inductive Logic Programming (ILP) and association rule mining are optimized for data with categorical values, and handling numerical attributes is expensive. In this paper, we present an extension to the top-down ILP algorithm that enables efficient discovery of Datalog rules from data with both numerical and categorical attributes. Our approach comprises a preprocessing phase that computes the correlations between numerical and categorical attributes, and an extension to the ILP refinement step that lets us detect interesting candidate rules and suggest refinements with relevant attribute combinations. We report on experiments with U.S. Census data, Freebase, and DBpedia, and show that our approach helps to efficiently discover rules with numerical intervals.
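To make the preprocessing idea concrete, the following is a minimal sketch of ranking categorical attributes by how strongly they correlate with a numerical attribute. The abstract does not say which correlation measure the authors use; the correlation ratio (eta squared) here is an assumed stand-in, and the function names and record layout are hypothetical.

```python
from collections import defaultdict


def correlation_ratio(categories, values):
    """Eta squared: share of the variance of `values` explained by `categories`.

    Assumed stand-in measure; the paper's actual measure may differ.
    """
    groups = defaultdict(list)
    for category, value in zip(categories, values):
        groups[category].append(value)

    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # constant numerical attribute: nothing to explain

    # Between-group sum of squares, weighted by group size.
    between_ss = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return between_ss / total_ss


def rank_categorical_attributes(records, numeric_key, categorical_keys):
    """Sort categorical attributes by their correlation with `numeric_key`."""
    values = [r[numeric_key] for r in records]
    scores = [
        (key, correlation_ratio([r[key] for r in records], values))
        for key in categorical_keys
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy records, not taken from the paper's data sets.
records = [
    {"income": 1, "occupation": "a", "eye_color": "x"},
    {"income": 1, "occupation": "a", "eye_color": "y"},
    {"income": 3, "occupation": "b", "eye_color": "x"},
    {"income": 3, "occupation": "b", "eye_color": "y"},
]
ranking = rank_categorical_attributes(records, "income", ["eye_color", "occupation"])
```

In a refinement step, a ranking of this kind could be used to try highly correlated attribute combinations first and to skip uncorrelated ones, which is one plausible reading of "suggest refinements with relevant attribute combinations".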





Publication date: 2014